Relation Extraction for Ontology Construction
Abstract
This proposal is for a programme of work leading to the building of a data querying application. The starting point is a collection of relational databases holding cultural heritage material from the National Collections of Scotland. The data is a mixture of fixed fields and free text, supported by background material such as domain thesauri. The goal is to produce a system for running queries against this material that does not assume the user has expert knowledge of the data structure or the specialist domain terminology. Two core tasks are proposed as necessary steps towards the goal:
• Extraction of two-place relations from free text: The purpose of this step is to translate key facts from the textual material into a standardised format. A combination of rule-based and machine learning approaches is planned.
• Automatic assembly of all relevant data into an ontology: An ontology is defined here as a graph of two-place relations where the edges represent predicates and the nodes the entities they apply to. All of the relevant information (from database fields, domain thesauri and the extracted textual relations) will be combined into such a graph.
The application will run against the populated ontology. An interactive interface is proposed, in which a tailored summary based on the user's initial query is generated and the user is then invited to refine the query based on this information. Evaluation of each subtask is planned, and the overall criterion for success will be that query performance is comparable to what is currently available in terms of speed and range, whilst also producing improved results for the non-expert user.
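The two core tasks can be illustrated with a minimal Python sketch. It is not the proposal's implementation: the single regular-expression rule, the predicate names (`designed_by`, `located_in`, `broader_term`), the `Relation` and `Ontology` types, and all sample entities are made up for illustration. It shows one crude rule-based extraction of a two-place relation from free text, and the assembly of relations from three sources (database field, thesaurus, extracted text) into one graph whose nodes are entities and whose edges are predicates.

```python
import re
from collections import defaultdict
from typing import NamedTuple


class Relation(NamedTuple):
    """A two-place relation: predicate(subject, obj)."""
    subject: str
    predicate: str
    obj: str


# Hypothetical rule: "<X> was designed by <Y>" yields designed_by(X, Y).
# Deliberately crude; it assumes the sentence starts with the entity name.
DESIGNED_BY = re.compile(
    r"(?P<subj>[A-Z][\w ]+?) was designed by (?P<obj>[A-Z][\w ]+)"
)


def extract_relations(text: str) -> list[Relation]:
    """Apply the single illustrative rule to a piece of free text."""
    return [
        Relation(m.group("subj").strip(), "designed_by", m.group("obj").strip())
        for m in DESIGNED_BY.finditer(text)
    ]


class Ontology:
    """Directed labelled graph: nodes are entities, edges are predicates."""

    def __init__(self) -> None:
        self._edges: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def add(self, relation: Relation) -> None:
        self._edges[relation.subject].add((relation.predicate, relation.obj))

    def objects(self, subject: str, predicate: str) -> set[str]:
        """Return every node reachable from `subject` via `predicate`."""
        return {o for p, o in self._edges.get(subject, set()) if p == predicate}


# Made-up sample data standing in for the three sources named in the abstract.
ontology = Ontology()
ontology.add(Relation("Example Hall", "located_in", "Example Town"))  # database field
ontology.add(Relation("hall", "broader_term", "building"))            # domain thesaurus
for rel in extract_relations("Example Hall was designed by Jane Doe."):
    ontology.add(rel)                                                 # free text

print(ontology.objects("Example Hall", "designed_by"))
```

Storing every fact in the same subject-predicate-object shape is what would let a query run over database fields, thesaurus links and extracted text without the user needing to know which source a fact came from. The machine learning component mentioned in the abstract is not sketched here.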
Related resources
Semi-automatic Domain Ontology Construction from Spoken Corpus in Tunisian Dialect: Railway Request Information
In this paper, we present a hybrid method for the semi-automatic building of a domain ontology from a spoken dialogue corpus in Tunisian Dialect for the railway request information domain. The proposed method is based on a statistical method for term and concept extraction and a linguistic method for semantic relation extraction. This method consists of three fundamental phases, namely the corpus const...
Automatic Thai Ontology Construction and Maintenance System
An ontology is an essential resource for enhancing the performance of information processing systems, in tasks such as information integration, document classification in taxonomies, information retrieval and data cleaning in database systems. This paper proposes three methodologies for Automatic Thai Ontology Construction and Maintenance from technical corpus, dictionary and thesaurus. For corpus base...
Automatic Wayang Ontology Construction using Relation Extraction from Free Text
This paper reports on our work to automatically construct and populate an ontology of wayang (Indonesian shadow puppet) mythology from free text using relation extraction and relation clustering. A reference ontology is used to evaluate the generated ontology. The reference ontology contains concepts and properties within the wayang character domain. We examined the influence of corpus data var...
Relation Extraction Based on AGROVOC
Relation extraction is an important step in ontology construction. This paper provides a method for extracting relations among concepts using AGROVOC.
Taxonomic Relation Extraction from Wikipedia: Datasets and Algorithms
The dynamic and continuously growing category structure of Wikipedia has been used in numerous ontology extraction methods. We present a dataset of category subgraphs automatically extracted from Wikipedia that are manually annotated for is-a and instance-of relations in order to enable a more comprehensive evaluation of taxonomy mining approaches. We also show how the new dataset can be used w...
Creation of a bottom-up corpus-based ontology for Italian Linguistics
This paper describes the steps of construction of a shallow lexical ontology of Italian Linguistics in Italian, set to be used by a metasearch engine for query refinement. The ontology was constructed with the software Protégé 4.0.2 and encoded in OWL format; its construction has been carried out following the steps described in the well-known Ontology Learning From Text (OLFT) layer cake. The ...
Publication date: 2006